Q-S5: Towards Quantized State Space Models
Abreu, Steven, Pedersen, Jens E., Heckel, Kade M., Pierro, Alessandro
In the quest for next-generation sequence modeling architectures, State Space Models (SSMs) have emerged as a potent alternative to transformers, particularly for their computational efficiency and suitability for dynamical systems. This paper investigates the effect of quantization on the S5 model to understand its impact on model performance and to facilitate its deployment to edge and resource-constrained platforms. Using quantization-aware training (QAT) and post-training quantization (PTQ), we systematically evaluate the quantization sensitivity of SSMs across tasks including dynamical systems modeling, Sequential MNIST (sMNIST), and most of the Long Range Arena (LRA). We present fully quantized S5 models whose test accuracy drops by less than 1% on sMNIST and on most LRA tasks. We find that performance on most tasks degrades significantly when recurrent weights fall below 8-bit precision, but that other components can be compressed further without significant loss of performance. Our results further show that PTQ performs well only on language-based LRA tasks, whereas all others require QAT. Our investigation provides necessary insights for the continued development of efficient and hardware-optimized SSMs.
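The core operation behind both QAT and PTQ is uniform fake quantization: rounding a weight tensor onto a b-bit grid and mapping it back to floats, so the quantization error is visible at evaluation time (and, in QAT, during training). Below is a minimal NumPy sketch under our own assumptions; the function name and the per-tensor symmetric scaling scheme are illustrative, not the paper's actual implementation.

```python
import numpy as np

def fake_quantize(w: np.ndarray, bits: int) -> np.ndarray:
    """Simulate symmetric uniform quantization: round onto a b-bit grid,
    then map back to floats (the 'fake quantization' used in QAT).
    Illustrative sketch; not the Q-S5 authors' code."""
    qmax = 2 ** (bits - 1) - 1             # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax       # per-tensor scale (an assumption)
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

rng = np.random.default_rng(0)
A = rng.normal(size=(64, 64)) / np.sqrt(64)  # stand-in recurrent weight matrix

for bits in (8, 6, 4):
    err = np.linalg.norm(A - fake_quantize(A, bits)) / np.linalg.norm(A)
    print(f"{bits}-bit relative quantization error: {err:.4f}")
```

Running this shows the relative error roughly quadrupling with every two bits removed, which is consistent with the abstract's observation that recurrent weights become the bottleneck below 8-bit precision.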
Learn one size to infer all: Exploiting translational symmetries in delay-dynamical and spatio-temporal systems using scalable neural networks
Goldmann, Mirko, Mirasso, Claudio R., Fischer, Ingo, Soriano, Miguel C.
We design scalable neural networks adapted to translational symmetries in dynamical systems, capable of inferring untrained high-dimensional dynamics for different system sizes. We first train these networks to predict the dynamics of delay-dynamical and spatio-temporal systems for a single system size; we then drive the networks with their own predictions in closed loop. We demonstrate that by scaling the size of the trained network, we can predict the complex dynamics of larger or smaller system sizes. Thus, the network learns from a single example and, by exploiting symmetry properties, infers entire bifurcation diagrams.
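The canonical delay-dynamical benchmark in this line of work is the Mackey-Glass system, dx/dt = βx(t−τ)/(1 + x(t−τ)^n) − γx(t), which is chaotic for the standard parameters β = 0.2, γ = 0.1, n = 10, τ = 17. The sketch below generates such a series with a simple Euler scheme and a history buffer; the integrator and parameter choices are ours, not necessarily those used by the authors.

```python
import numpy as np

def mackey_glass(n_steps: int, tau: float = 17.0, beta: float = 0.2,
                 gamma: float = 0.1, n: float = 10.0, dt: float = 0.1) -> np.ndarray:
    """Euler integration of the Mackey-Glass delay differential equation:
    dx/dt = beta*x(t-tau) / (1 + x(t-tau)**n) - gamma*x(t).
    The delayed state is read from a buffer of past values."""
    delay = int(round(tau / dt))
    x = np.full(delay + n_steps, 1.2)  # constant history as initial condition
    for t in range(delay, delay + n_steps - 1):
        x_tau = x[t - delay]
        x[t + 1] = x[t] + dt * (beta * x_tau / (1.0 + x_tau ** n) - gamma * x[t])
    return x[delay:]

series = mackey_glass(5000)
print(series[:5])
```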
Rapid Time Series Prediction with a Hardware-Based Reservoir Computer
Canaday, Daniel, Griffith, Aaron, Gauthier, Daniel
There is considerable interest in the machine learning community in using recurrent neural networks (RNNs) for processing time-dependent signals. Many machine learning and artificial intelligence tasks, such as dynamical system modeling, human speech recognition, and natural language processing, are intrinsically time-dependent and thus are more naturally handled within a time-dependent, neural-network framework. Though they have high expressive power, RNNs are difficult to train using gradient-descent-based methods. One approach to efficiently and rapidly train an RNN is known as reservoir computing (RC). In RC, the network is divided into input nodes, a bulk collection of nodes known as the reservoir, and output nodes, such that the only recurrent links are between reservoir nodes. Training involves adjusting only the weights along links connecting the reservoir to the output nodes, not the recurrent links within the reservoir. Recently, implementations of reservoir computing using dedicated hardware have attracted much attention, particularly those based on delay-coupled photonic systems. These devices allow for reservoir computing at extremely high speeds, including the classification of spoken words at a rate of millions of words per second. There is also the potential to form the input and output layers out of optics as well, resulting in an all-optical computational device. Physical implementations using optical reservoirs have demonstrated remarkable accuracy and processing speed at benchmark tasks. However, these approaches require an electronic output layer to maintain high performance, which limits their use in tasks such as time-series prediction, where the output is fed back into the reservoir. We present here a reservoir computing scheme that achieves rapid processing speed in both the reservoir and the output layer.
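To make the RC training scheme concrete, here is a minimal software echo state network with a ridge-regression readout: only the reservoir-to-output weights are fit, exactly as the abstract describes. The reservoir size, toy signal, and hyperparameters are illustrative assumptions; the paper's contribution is a hardware reservoir and output layer, which this purely digital sketch does not reproduce.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy one-dimensional signal; the task is one-step-ahead prediction.
t = np.arange(3000)
u = np.sin(0.2 * t) * np.cos(0.031 * t)

N = 300                                    # reservoir size (illustrative)
W_in = rng.uniform(-0.5, 0.5, size=N)      # fixed input weights
W = rng.normal(size=(N, N))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

# Drive the reservoir with the input; recurrent weights W are never trained.
X = np.zeros((len(u), N))
x = np.zeros(N)
for k in range(len(u) - 1):
    x = np.tanh(W @ x + W_in * u[k])
    X[k] = x

# Ridge-regression readout: the only trained weights in reservoir computing.
train = slice(100, 2000)                   # discard washout, hold out the rest
ridge = 1e-6
W_out = np.linalg.solve(X[train].T @ X[train] + ridge * np.eye(N),
                        X[train].T @ u[101:2001])

pred = X[2000:2999] @ W_out                # closed-form readout on held-out states
nrmse = np.sqrt(np.mean((pred - u[2001:3000]) ** 2)) / np.std(u)
print(f"one-step NRMSE: {nrmse:.4f}")
```

Because training reduces to one linear solve, the readout can be refit in milliseconds, which is the property the hardware approaches above aim to preserve at optical speeds.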